8 research outputs found

    Graph-based exploitation of gene ontology using GOxploreR for scrutinizing biological significance.

    Gene ontology (GO) is an eminent knowledge base frequently used for providing biological interpretations for the analysis of genes or gene sets from biological, medical and clinical problems. Unfortunately, the interpretation of such results is challenging due to the large number of GO terms, their hierarchical and connected organization as directed acyclic graphs (DAGs) and the lack of tools that allow this structural information to be exploited explicitly. For this reason, we developed the R package GOxploreR. The main features of GOxploreR are (I) easy and direct access to structural features of GO, (II) structure-based ranking of GO-terms, (III) mapping to reduced GO-DAGs including visualization capabilities and (IV) prioritization of GO-terms. The underlying idea of GOxploreR is to exploit a graph-theoretical perspective of GO as manifested by its DAG-structure and the hierarchy levels it contains for accumulating semantic information. That means all these features enhance the utilization of structural information of GO and complement existing analysis tools. Overall, GOxploreR provides exploratory as well as confirmatory tools for complementing any kind of analysis resulting in a list of GO-terms, e.g., from differentially expressed genes or gene sets, GWAS or biomarkers. Our R package GOxploreR is freely available from CRAN.

    Intense and Mild First Epidemic Wave of Coronavirus Disease, The Gambia.

    The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic is evolving differently in Africa than in other regions. Africa has lower SARS-CoV-2 transmission rates and milder clinical manifestations. Detailed SARS-CoV-2 epidemiologic data are needed in Africa. We used publicly available data to calculate SARS-CoV-2 infections per 1,000 persons in The Gambia. We evaluated transmission rates among 1,366 employees of the Medical Research Council Unit The Gambia (MRCG), where systematic surveillance of symptomatic cases and contact tracing were implemented. By September 30, 2020, The Gambia had identified 3,579 SARS-CoV-2 cases, including 115 deaths; 67% of cases were identified in August. Among infections, MRCG staff accounted for 191 cases; all were asymptomatic or mild. The cumulative incidence rate among nonclinical MRCG staff was 124 infections/1,000 persons, which is >80-fold higher than estimates of diagnosed cases among the population. Systematic surveillance and seroepidemiologic surveys are needed to clarify the extent of SARS-CoV-2 transmission in Africa.

    Analysis of Prognostic Gene Expression Signatures of Breast and Prostate Cancer

    The advent of high-throughput sequencing and microarray technologies has improved our overall understanding of human health and diseases. Recent developments in ‘omics’ technologies (genomics, transcriptomics, proteomics, and metabolomics) have made it possible to measure tens of thousands of molecular quantities in parallel (in the same experiment). One specific objective of this thesis is to scrutinize a comparatively small subset of such measurements known as biomarkers or signatures relevant for predicting the course of a disease or a patient’s outcome. In the context of cancer, the discovery of prognostic biomarkers for predicting cancer progression is an important problem for two reasons. First, such biomarkers may be used to treat patients in a clinical setting. Second, it is thought that investigating the biomarkers themselves would yield novel insights into disease mechanisms and the underlying molecular processes that trigger pathological behavior. The latter assumption is investigated in detail in this dissertation. Specifically, we study this problem for breast and prostate cancer by looking at a large number of previously reported prognostic signatures of breast and prostate cancer based on gene expression profiles. For this, we created a novel gene removal procedure (GRP) that purges all traces of biological meaning of the signature genes, and we show that surrogate genes can be found among the remaining genes with better or equivalent prognostic prediction capabilities but with a biological meaning distinct from that of the published signature genes. As a result, our findings demonstrate that none of the examined signatures has a sensible biological meaning in terms of disease etiology; they are merely black-box models that allow predictions of patient outcome but cannot offer causal explanations to improve disease understanding.

    Limitations of explainability for established prognostic biomarkers of prostate cancer

    High-throughput technologies provide novel means not only for basic biological research but also for clinical applications in hospitals. For instance, the usage of gene expression profiles as prognostic biomarkers for predicting, e.g., cancer progression, has found widespread interest. Aside from predicting the progression of patients, it is generally believed that such prognostic biomarkers also provide valuable information about disease mechanisms and the underlying molecular processes that are causal for a disorder. However, the latter assumption has been challenged. In this paper, we study this problem for prostate cancer. Specifically, we investigate a large number of previously published prognostic signatures of prostate cancer based on gene expression profiles and show that none of these can provide unique information about the underlying disease etiology of prostate cancer. Hence, our analysis reveals that none of the studied signatures has a sensible biological meaning. Overall, this shows that all studied prognostic signatures are merely black-box models that allow sensible predictions of prostate cancer outcome but cannot provide causal explanations to enhance the understanding of prostate cancer.

    Are There Limits in Explainability of Prognostic Biomarkers? : Scrutinizing Biological Utility of Established Signatures

    Prognostic biomarkers can have an important role in clinical practice because they allow stratification of patients in terms of predicting the outcome of a disorder. Obstacles for developing such markers include lack of robustness when using different data sets and limited concordance among similar signatures. In this paper, we highlight a new problem that relates to the biological meaning of already established prognostic gene expression signatures. Specifically, it is commonly assumed that prognostic markers provide sensible biological information and molecular explanations about the underlying disorder. However, recent studies on prognostic biomarkers investigating 80 established signatures of breast and prostate cancer demonstrated that this is not the case. We will show that this surprising result is related to the distinction between causal models and predictive models and the obfuscating usage of these models in the biomedical literature. Furthermore, we suggest a falsification procedure for studies aiming to establish a prognostic signature to safeguard against false expectations with respect to biological utility.

    Graph-based exploitation of gene ontology using GOxploreR for scrutinizing biological significance

    Gene ontology (GO) is an eminent knowledge base frequently used for providing biological interpretations for the analysis of genes or gene sets from biological, medical and clinical problems. Unfortunately, the interpretation of such results is challenging due to the large number of GO terms, their hierarchical and connected organization as directed acyclic graphs (DAGs) and the lack of tools that allow this structural information to be exploited explicitly. For this reason, we developed the R package GOxploreR. The main features of GOxploreR are (I) easy and direct access to structural features of GO, (II) structure-based ranking of GO-terms, (III) mapping to reduced GO-DAGs including visualization capabilities and (IV) prioritization of GO-terms. The underlying idea of GOxploreR is to exploit a graph-theoretical perspective of GO as manifested by its DAG-structure and the hierarchy levels it contains for accumulating semantic information. That means all these features enhance the utilization of structural information of GO and complement existing analysis tools. Overall, GOxploreR provides exploratory as well as confirmatory tools for complementing any kind of analysis resulting in a list of GO-terms, e.g., from differentially expressed genes or gene sets, GWAS or biomarkers. Our R package GOxploreR is freely available from CRAN.

    A data-centric review of deep transfer learning with applications to text data

    In recent years, many applications have used various forms of deep learning models. Such methods are usually based on traditional learning paradigms requiring the consistency of properties among the feature spaces of the training and test data and also the availability of large amounts of training data, e.g., for performing supervised learning tasks. However, many real-world data do not adhere to such assumptions. In such situations, transfer learning can provide feasible solutions, e.g., by simultaneously learning from data-rich source data and data-sparse target data to transfer information for learning a target task. In this paper, we survey deep transfer learning models with a focus on applications to text data. First, we review the terminology used in the literature and introduce a new nomenclature allowing the unequivocal description of a transfer learning model. Second, we introduce a visual taxonomy of deep learning approaches that provides a systematic structure to the many diverse models introduced until now. Furthermore, we provide comprehensive information about text data that have been used for studying such models, because only by applying methods to data can performance measures be estimated and models assessed.

    Prognostic gene expression signatures of breast cancer are lacking a sensible biological meaning

    The identification of prognostic biomarkers for predicting cancer progression is an important problem for two reasons. First, such biomarkers find practical application in a clinical context for the treatment of patients. Second, interrogation of the biomarkers themselves is assumed to lead to novel insights into disease mechanisms and the underlying molecular processes that cause the pathological behavior. For breast cancer, many signatures based on gene expression values have been reported to be associated with overall survival. Consequently, such signatures have been used for suggesting biological explanations of breast cancer and drug mechanisms. In this paper, we demonstrate for a large number of breast cancer signatures that such an implication is not justified. Our approach systematically eliminates all traces of biological meaning of signature genes and shows that among the remaining genes, surrogate gene sets can be formed with indistinguishable prognostic prediction capabilities and opposite biological meaning. Hence, our results demonstrate that none of the studied signatures has a sensible biological interpretation or meaning with respect to disease etiology. Overall, this shows that prognostic signatures are black-box models with sensible predictions of breast cancer outcome but no value for revealing causal connections. Furthermore, we show that the number of such surrogate gene sets is not small but very large.